classification entropy
Digital Divides in Scene Recognition: Uncovering Socioeconomic Biases in Deep Learning Systems
Greene, Michelle R., Josyula, Mariam, Si, Wentao, Hart, Jennifer A.
Computer-based scene understanding has influenced fields ranging from urban planning to autonomous vehicle performance, yet little is known about how well these technologies work across social differences. We investigate the biases of deep convolutional neural networks (dCNNs) in scene classification, using nearly one million images from global and US sources, including user-submitted home photographs and Airbnb listings. We applied statistical models to quantify the impact of socioeconomic indicators such as family income, Human Development Index (HDI), and demographic factors from public data sources (CIA and US Census) on dCNN performance. Our analyses revealed significant socioeconomic bias, where pretrained dCNNs demonstrated lower classification accuracy, lower classification confidence, and a higher tendency to assign labels that could be offensive when applied to homes (e.g., "ruin", "slum"), especially in images from homes with lower socioeconomic status (SES). This trend is consistent across two datasets of international images and within the diverse economic and racial landscapes of the United States. This research contributes to understanding biases in computer vision, emphasizing the need for more inclusive and representative training datasets. By mitigating the bias in the computer vision pipelines, we can ensure fairer and more equitable outcomes for applied computer vision, including home valuation and smart home security systems. There is urgency in addressing these biases, which can significantly impact critical decisions in urban development and resource allocation. Our findings also motivate the development of AI systems that better understand and serve diverse communities, moving towards technology that equitably benefits all sectors of society.
- North America > United States (0.67)
- Oceania > Samoa (0.04)
- Oceania > Pitcairn (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Smart Houses & Appliances (0.54)
- Health & Medicine > Public Health (0.48)
- Banking & Finance > Economy (0.46)
Deep Bayesian Active Semi-Supervised Learning
Rottmann, Matthias, Kahl, Karsten, Gottschalk, Hanno
In many applications the process of generating label information is expensive and time consuming. We present a new method that combines active and semi-supervised deep learning to achieve high generalization performance from a deep convolutional neural network with as few known labels as possible. In a setting where a small amount of labeled data as well as a large amount of unlabeled data is available, our method first learns the labeled data set. This initialization is followed by an expectation maximization algorithm, where further training reduces classification entropy on the unlabeled data by targeting a low entropy fit which is consistent with the labeled data. In addition, at a specified frequency the algorithm asks an oracle for labels of data whose entropy lies above a certain entropy quantile. Using this active learning component we obtain an agile labeling process that achieves high accuracy but requires only a small amount of known labels. For the MNIST dataset we report an error rate of 2.06% using only 300 labels and 1.06% for 1,000 labels. These results are obtained without employing any special network architecture or data augmentation.
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
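The active learning step in the abstract above (asking an oracle for labels of samples whose classification entropy lies above a quantile) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `classification_entropy` and `select_queries`, the use of natural-log entropy, and the nearest-rank quantile rule are all assumptions made for illustration, and the expectation maximization training loop is not shown.

```python
import math
from typing import List, Sequence


def classification_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted class distribution."""
    # Terms with p == 0 contribute nothing, so they are skipped.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_queries(pred_probs: Sequence[Sequence[float]], quantile: float) -> List[int]:
    """Indices of samples whose entropy is strictly above the given quantile.

    Hypothetical helper: the quantile is computed with a simple
    nearest-rank rule, which is an illustrative choice only.
    """
    entropies = [classification_entropy(p) for p in pred_probs]
    ranked = sorted(entropies)
    cut_index = max(math.ceil(quantile * len(ranked)) - 1, 0)
    cut = ranked[cut_index]
    return [i for i, h in enumerate(entropies) if h > cut]


# Four softmax outputs from a hypothetical 3-class model.
preds = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.40, 0.30, 0.30],  # uncertain prediction
    [0.90, 0.05, 0.05],  # fairly confident prediction
    [0.34, 0.33, 0.33],  # nearly uniform, highest entropy
]
to_label = select_queries(preds, quantile=0.5)
```

With `quantile=0.5`, the two most uncertain samples (indices 1 and 3) would be sent to the oracle; raising the quantile shrinks the set of queried samples and so the labeling cost.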
Guess-And-Verify Heuristics for Reducing Uncertainties in Expert Classification Systems
Qiu, Yuping, Cox, Louis Anthony Jr., Davis, Lawrence
An expert classification system having statistical information about the prior probabilities of the different classes should be able to use this knowledge to reduce the amount of additional information that it must collect, e.g., through questions, in order to make a correct classification. This paper examines how best to use such prior information and additional information-collection opportunities to reduce uncertainty about the class to which a case belongs, thus minimizing the average cost or effort required to correctly classify new cases.
- North America > United States > Colorado > Boulder County > Boulder (0.04)
- North America > United States > New York (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Santa Clara County > Los Altos (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)